Extracting and Querying Relations in Scientific Papers on Language Technology

نویسندگان

  • Ulrich Schäfer
  • Hans Uszkoreit
  • Christian Federmann
  • Torsten Marek
  • Yajing Zhang
چکیده

We describe methods for extracting interesting factual relations from scientific texts in computational linguistics and language technology taken from the ACL Anthology. We use a hybrid NLP architecture with shallow preprocessing for increased robustness and domainspecific, ontology-based named entity recognition, followed by a deep HPSG parser running the English Resource Grammar (ERG). The extracted relations in the MRS (minimal recursion semantics) format are simplified and generalized using WordNet. The resulting ‘quriples’ are stored in a database from where they can be retrieved (again using abstraction methods) by relation-based search. The query interface is embedded in a web browser-based application we call the Scientist’s Workbench. It supports researchers in editing and online-searching scientific papers.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Extracting and Querying Relations in Scientific Papers on Language Technology pdfsubject

We describe methods for extracting interesting factual relations from scientific texts in computational linguistics and language technology taken from the ACL Anthology. We use a hybrid NLP architecture with shallow preprocessing for increased robustness and domainspecific, ontology-based named entity recognition, followed by a deep HPSG parser running the English Resource Grammar (ERG). The ex...

متن کامل

Natural Language Processing for Intelligent Access to Scientific Information

During the last decade the amount of scientific information available on-line increased at an unprecedented rate. As a consequence, nowadays researchers are overwhelmed by an enormous and continuously growing number of articles to consider when they perform research activities like the exploration of advances in specific topics, peer reviewing, writing and evaluation of proposals. Natural Langu...

متن کامل

رویکردی با ناظر در استخراج واژگان کلیدی اسناد فارسی با استفاده از زنجیره‌های لغوی

Keywords are the main focal points of interest within a text, which intends to represent the principal concepts outlined in the document. Determining the keywords using traditional methods is a time consuming process and requires specialized knowledge of the subject. For the purposes of indexing the vast expanse of electronic documents, it is important to automate the keyword extraction task. S...

متن کامل

Fuzzy Data Mining for Querying and Retrieval of Research Archival Information

This paper discusses the design of a prototype information/intelligent system (FUZZYBASE) to facilitate intelligent and fast retrieval of information that is of interest to scientific research communities with specific needs, such as getting RELEVANT technical information FAST. It involves the intelligent fuzzy retrieval of information, crisp retrieval of cataloged information, high end compute...

متن کامل

Extracting Definitions of Mathematical Expressions in Scientific Papers

Natural language definitions of mathematical expressions are essential for understanding the mathematical content of scientific papers. A textual description corresponding to a mathematical expression determines the type of symbol or function and the specific name for reference. Our objective is to create an automatic way of extracting definitions of mathematical expressions. We needed to creat...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2008